Bayesian machine learning methods for predicting protein-peptide interactions and detecting mosaic structures in DNA sequences alignments
Short well-defined domains known as peptide recognition modules (PRMs) regulate many important protein-protein interactions involved in the formation of macromolecular complexes
and biochemical pathways. High-throughput experiments like yeast two-hybrid and phage
display are expensive and intrinsically noisy; it is therefore desirable to target informative interactions and pursue in silico approaches. We propose a probabilistic discriminative
approach for predicting PRM-mediated protein-protein interactions from sequence data. The
model suffered from over-fitting, and Laplacian regularisation proved important in
achieving reasonable generalisation performance. A hybrid approach, in which the binding site motifs were initialised with the predictions of a generative
model, yielded the best performance.
We also propose another discriminative model which can be applied to all sequences
present in the organism at a significantly lower computational cost. This is due to its additional
assumption that the underlying binding sites tend to be similar.
It is difficult to distinguish between the binding site motifs of the PRM due to the small
number of instances of each binding site motif. However, closely related species are expected
to share similar binding sites, which would be expected to be highly conserved. We investigated
rate variation along DNA sequence alignments, modelling confounding effects such as recombination. Traditional approaches to phylogenetic inference assume that a single phylogenetic
tree can represent the relationships and divergences between the taxa. However, taxa sequences
exhibit varying levels of conservation, e.g. due to regulatory elements and active binding sites,
and certain bacteria and viruses undergo interspecific recombination. We propose a phylogenetic factorial hidden Markov model to infer recombination and rate variation. We examined
the performance of our model and inference scheme on various synthetic alignments, and compared it to state-of-the-art breakpoint models. We investigated three DNA sequence alignments:
one of maize actin genes, one bacterial (Neisseria), and one of HIV-1. Inference is carried
out in the Bayesian framework, using Reversible Jump Markov Chain Monte Carlo.
Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables
Probabilistic graphical models (PGMs) provide a compact representation of
knowledge that can be queried in a flexible way: after learning the parameters
of a graphical model once, new probabilistic queries can be answered at test
time without retraining. However, when using undirected PGMs with hidden
variables, two sources of error typically compound in all but the simplest
models: (a) learning error (both computing the partition function and
integrating out the hidden variables is intractable); and (b) prediction error
(exact inference is also intractable). Here we introduce query training (QT), a
mechanism to learn a PGM that is optimized for the approximate inference
algorithm that will be paired with it. The resulting PGM is a worse model of
the data (as measured by the likelihood), but it is tuned to produce better
marginals for a given inference algorithm. Unlike prior works, our approach
preserves the querying flexibility of the original PGM: at test time, we can
estimate the marginal of any variable given any partial evidence. We
demonstrate experimentally that QT can be used to learn a challenging
8-connected grid Markov random field with hidden variables and that it
consistently outperforms the state-of-the-art AdVIL when tested on three
undirected models across multiple datasets.
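The kind of query described above (the marginal of any variable given any partial evidence) can be illustrated on a toy model. The sketch below is not query training and not the authors' code: it computes the exact conditional marginals of a tiny binary pairwise Markov random field by brute-force enumeration, which is the quantity an approximate inference algorithm would try to estimate on larger models. The function name `conditional_marginals` and the parameterisation (a unary score for `x_i = 1`, a pairwise score when two neighbours agree) are assumptions made for illustration.

```python
from itertools import product
from math import exp


def conditional_marginals(n, unary, pairwise, evidence):
    """Exact P(x_i = 1 | evidence) for a binary pairwise MRF, by enumeration.

    n        -- number of binary variables
    unary    -- list of n scores, added when x_i == 1 (hypothetical parameterisation)
    pairwise -- dict {(i, j): w}, score w added when x_i == x_j
    evidence -- dict {i: 0 or 1} giving the observed variables
    Only feasible for small n: the loop visits all 2**n states.
    """
    totals = [0.0] * n
    z = 0.0
    for state in product((0, 1), repeat=n):
        # Skip states that contradict the evidence.
        if any(state[i] != v for i, v in evidence.items()):
            continue
        score = sum(unary[i] * state[i] for i in range(n))
        score += sum(w for (i, j), w in pairwise.items() if state[i] == state[j])
        weight = exp(score)
        z += weight
        for i in range(n):
            totals[i] += weight * state[i]
    return [t / z for t in totals]


# A 3-variable chain with attractive couplings; observe x_0 = 1.
chain = {(0, 1): 1.0, (1, 2): 1.0}
marginals = conditional_marginals(3, [0.0, 0.0, 0.0], chain, {0: 1})
```

Any subset of variables can be passed as `evidence`, which mirrors the querying flexibility the abstract refers to; the influence of the observed variable decays along the chain.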
Molecular Threading: Mechanical Extraction, Stretching and Placement of DNA Molecules from a Liquid-Air Interface
We present “molecular threading”, a surface-independent tip-based method for stretching and depositing single- and double-stranded DNA molecules. DNA is stretched into air at a liquid-air interface, and can be subsequently deposited onto a dry substrate isolated from solution. The design of an apparatus used for molecular threading is presented, and fluorescence and electron microscopies are used to characterize the angular distribution, straightness, and reproducibility of stretched DNA deposited in arrays onto elastomeric surfaces and thin membranes. Molecular threading demonstrates high straightness and uniformity over length scales from nanometers to micrometers, and represents an alternative to existing DNA deposition and linearization methods. These results point towards scalable and high-throughput precision manipulation of single-molecule polymers.
International Journal of Cancer / Synergistic crosstalk of hedgehog and interleukin-6 signaling drives growth of basal cell carcinoma
Persistent activation of hedgehog (HH)/GLI signaling accounts for the development of basal cell carcinoma (BCC), a very frequent nonmelanoma skin cancer with rising incidence. Targeting HH/GLI signaling by approved pathway inhibitors can provide significant therapeutic benefit to BCC patients. However, limited response rates, development of drug resistance, and severe side effects of HH pathway inhibitors call for improved treatment strategies such as rational combination therapies simultaneously inhibiting HH/GLI and cooperative signals promoting the oncogenic activity of HH/GLI. In this study, we identified the interleukin-6 (IL6) pathway as a novel synergistic signal promoting oncogenic HH/GLI via STAT3 activation. Mechanistically, we provide evidence that signal integration of IL6 and HH/GLI occurs at the level of cis-regulatory sequences by co-binding of GLI and STAT3 to common HH-IL6 target gene promoters. Genetic inactivation of Il6 signaling in a mouse model of BCC significantly reduced in vivo tumor growth by interfering with HH/GLI-driven BCC proliferation. Our genetic and pharmacologic data suggest that combinatorial HH-IL6 pathway blockade is a promising approach to efficiently arrest cancer growth in BCC patients. (VLID)301234
Segmenting bacterial and viral DNA sequence alignments with a trans-dimensional phylogenetic factorial hidden Markov model
The traditional approach to phylogenetic inference assumes that a single phylogenetic tree can represent the relationships and divergence between the taxa. However, taxa sequences exhibit varying levels of conservation, e.g. because of regulatory elements and active binding sites. Also, certain bacteria and viruses undergo interspecific recombination, where different strains exchange or transfer DNA subsequences, leading to a tree topology change. We propose a phylogenetic factorial hidden Markov model to detect recombination and rate variation simultaneously. This is applied to two DNA sequence alignments: one bacterial ("Neisseria") and another of type 1 human immunodeficiency virus. Inference is carried out in the Bayesian framework, using reversible jump Markov chain Monte Carlo sampling. Copyright (c) 2009 Royal Statistical Society.
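One way to picture the factorial structure mentioned above is to treat two independent hidden chains (say, tree topology and evolutionary rate) as a single chain over their product states. The sketch below is a toy illustration under that assumption, not the authors' model: it uses a hand-written emission table in place of a phylogenetic likelihood, fixes the number of states instead of sampling it with reversible jump MCMC, and all names (`product_transitions`, `forward_loglik`, the example matrices) are hypothetical.

```python
from itertools import product
from math import log


def product_transitions(trans_a, trans_b):
    """Joint transition matrix of two independent chains over states (a, b)."""
    states = list(product(range(len(trans_a)), range(len(trans_b))))
    joint = [[trans_a[a][a2] * trans_b[b][b2] for (a2, b2) in states]
             for (a, b) in states]
    return states, joint


def forward_loglik(obs, init, joint, emit):
    """Scaled forward algorithm; emit[s][o] = P(o | joint state s)."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the joint transition matrix, then weight by emission.
            alpha = [emit[s][obs[t]] * sum(alpha[r] * joint[r][s] for r in range(n))
                     for s in range(n)]
        norm = sum(alpha)
        loglik += log(norm)
        alpha = [a / norm for a in alpha]  # rescale to avoid underflow
    return loglik


# Two topologies x two rate classes gives four joint hidden states.
topology = [[0.95, 0.05], [0.05, 0.95]]  # toy values: rare topology changes
rate = [[0.90, 0.10], [0.10, 0.90]]      # toy values: rate class switches
states, joint = product_transitions(topology, rate)
init = [0.25] * len(states)
# Toy emissions over two site-pattern categories, one row per joint state.
emit = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]]
loglik = forward_loglik([0, 0, 1, 1, 0], init, joint, emit)
```

The product construction keeps the sketch short, but its cost grows with the product of the chain sizes, which is one reason factorial models are usually paired with specialised or sampling-based inference.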